Standard thermodynamics is associated with the von Neumann entropy. There has recently 
been great interest in making thermodynamics more suitable for the quantum and nano-regime by 
taking inspiration from single-shot information theory, which replaces the von Neumann entropy 
with the more general smooth entropies. These reduce to the von Neumann entropy in the regime 
of asymptotically many identical and uncorrelated systems, the von Neumann regime. We propose 
generalisations of the laws of thermodynamics which analogously reduce to the standard laws in that 
same limit. We motivate these by analysing a very generic Szilard-engine model. We derive how 
much work one can extract in a single extraction as a function of the success-probability. Neither 
the initial nor final state is required to be thermal. In the appropriate limits we recover existing 
results. 



General introduction. — There has long been a 
productive cross-fertilization between information theory 
and statistical mechanics. For example, Jaynes' maximum
entropy approach to statistical mechanics was heavily
inspired by Shannon information theory [1]. In the
other direction, von Neumann's quantum entropy was
originally motivated through a thermodynamical argument,
and is now the canonical measure of information
in quantum information theory [2].

A recent development in information theory is the generalisation
of the standard Shannon information theory, classical as well as
quantum, to something tentatively called smooth entropy information
theory. To our knowledge this first appeared in [3, 4]. The
generalisation turned out to be necessary in the context of quantum
cryptography. The problem with the standard theory is that the
Shannon entropy (von Neumann entropy in the quantum case) only has
the desired operational interpretations in a particular limit:
asymptotically many independent and identically distributed (i.i.d.)
states. In cryptography one cannot allow oneself to make strong
assumptions like this. In contrast the entropies used in the smooth
entropy approach have the desired operational meanings for all
states, including finite-sized, correlated states, and for single
realizations of protocols, not just on average; hence the alternative
name for the approach: single-shot information theory. In the
appropriate limit the smooth entropy approach reduces to the standard
Shannon approach.

Given the history of fruitful cross-fertilization between the fields
it is natural to consider whether this new approach to information
theory is useful in statistical mechanics. To our knowledge this was
first considered in [5]. The focus was on the relationship between
information, quantified by entropy, and work. This relationship has
been the centre of much intriguing, and arguably very productive,
debate, cf. Maxwell's daemon [6-8], Szilard's engine [9], Landauer's
erasure [10], and Bennett's reversible measurements [11]. Particularly
important for [5] and our considerations here is the notion of a
Szilard engine [9] and Bennett's extensions thereof [11]. It was shown
in [5] that one can use smooth entropy to quantify the extractable
work in a Szilard engine, making the expressions much more generally
applicable than the corresponding Shannon entropy expression. This has
been followed by several further results. In [12] it was shown how to
interpret negative conditional entropy in these settings ([5] does not
deal with conditional entropy). Very recently, in [13] and
independently [14], the non-conditional case was considered in a
significantly more sophisticated and general manner than in [5]. These
articles taken together give much hope that a neat and greatly
generalised statistical mechanics, tentatively dubbed single-shot
statistical mechanics, is emerging. A key advantage of this approach
is that one can answer questions such as "how much work can I extract
in any given go (single-shot extraction) with a probability x of
success?". In standard thermodynamics one is only concerned with
averages.

In this Letter we take this much further. We give an expression for
the extractable work in a single extraction which reduces to the
expressions of [5, 13, 14] in the appropriate limits. We take the
system constituting the working medium of the Szilard engine to have a
given but arbitrary set of energy levels. Moreover it has an arbitrary
probability distribution over these levels, representing our knowledge
thereof. This corresponds to the eigenvalues of a density matrix
$\rho$ taken to be diagonal in the energy basis. Similarly we take the
final energy levels and probability distribution to also be fixed but
arbitrary. (In [13, 14] the final state is taken to be a thermal state
on the same energy levels.) Our finding is that the maximal work that
can be extracted given these initial and final conditions is
determined by an apparently novel measure of how much one distribution
majorizes another. We arrive at the result partly by combining
single-shot concepts with techniques from [15-18] which have to our
knowledge not previously been applied to this setting. We then use our
result to generalise the laws of thermodynamics, with particular
emphasis on the second law.

We proceed as follows. We first briefly review some relevant
definitions and existing results relating to single-shot statistical
mechanics. We then define the setting we consider here, the work
extraction game. We proceed to build up to our main statement by
defining the novel majorisation measure in question, as well as an
important tool we will adapt from [15-18]. We then give the main
statement: how much work one can at most extract in the game given the
initial and final conditions. A sketch of the proof is subsequently
provided; the full proof is in the technical supplement. Finally we
discuss the differences to previous results and what the result tells
us about the structure of single-shot thermodynamics.

Single-shot statistical mechanics, relevant key 
results. — We now briefly review key results that we shall 
later recover as special cases of our expression. (This is 
thus not an exhaustive list of all previous results). More 
specifically, the details of the models of work extraction 
in the different papers are not a priori identical, but we 
shall recover the same expressions for the extractable 
work within the model here. 

In [5] an expression for the extractable work given an
n-cylinder (part) Szilard engine was given as

$W^\epsilon = (n - H^\epsilon_{\max})\, kT \ln 2.$ (1)

Here $W^\epsilon$ is the work that can be extracted for certain
except with probability $\epsilon$. $H^\epsilon_{\max}$ is the smooth
max entropy of the density matrix representing the agent's initial
knowledge about the state of the working medium. This is defined as
$H^\epsilon_{\max}(\rho) := \log(\mathrm{Supp}^\epsilon(\rho))$, with
$\mathrm{Supp}^\epsilon(\rho)$ the size of the support of $\rho$
minimised over all states within $\epsilon$ trace distance of $\rho$.
(Actually there is an alternative definition as well but they are both
known to coincide up to an additive $\log(1/\epsilon)$ term, so for
simplicity we mention only one definition here.) T is the temperature
of the heat bath, and k Boltzmann's constant.
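For a classical (diagonal) state the Szilard-engine expression above can be evaluated directly. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, logarithms are taken to base 2, and the smoothing is implemented by discarding the smallest probabilities whose total weight does not exceed $\epsilon$, which is one way of reading the minimisation over nearby states.

```python
import math

def smooth_max_entropy(probs, eps, tol=1e-12):
    """log2 of the smallest support left after cutting at most eps of tail weight."""
    p = sorted((x for x in probs if x > 0), reverse=True)
    support, removed = len(p), 0.0
    # Drop the smallest entries while the discarded weight stays within eps.
    while support > 1 and removed + p[support - 1] <= eps + tol:
        removed += p[support - 1]
        support -= 1
    return math.log2(support)

def szilard_work(n_bits, probs, eps, kT=1.0):
    """(n - H_max^eps) kT ln 2 for knowledge `probs` over 2**n_bits states."""
    return (n_bits - smooth_max_entropy(probs, eps)) * kT * math.log(2)

probs = [0.9, 0.05, 0.03, 0.02]                   # two bits, four states
work_certain = szilard_work(2, probs, eps=0.0)    # full support: nothing to gain
work_risky = szilard_work(2, probs, eps=0.06)     # support shrinks to two states
```

In this toy case accepting a 6% failure probability raises the guaranteed work from zero to $kT \ln 2$, which is the kind of trade-off the single-shot expression is meant to capture.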

The more recent papers [13, 14] use work extraction models that are
not a priori equivalent to each other, but the result for the
extractable work is the same in both papers. A key result obtained
independently in both models is that given an initial state $\rho$ and
a final thermal state $\rho_T$ over the same energy levels, the work
that can be extracted with certainty up to $\epsilon$ failure
probability is:

$W^\epsilon = kT \ln(2)\, D_0^\epsilon(\rho\|\rho_T),$ (2)

where $\rho$ is taken to be diagonal, $\rho_T$ is the corresponding
thermal state on the same energy levels, and
$D_0^\epsilon(\rho\|\rho_T)$ is the $\epsilon$-smooth relative entropy
of order 0 (see [19]). This reduces to
$W = kT \ln 2\, D(\rho\|\rho_T)$, with $D$ the standard relative
entropy, in what we shall here call the von Neumann regime, where
$\rho = \tau^{\otimes n}$, $n \to \infty$ and $\epsilon \to 0$. That
latter expression is well-established, see e.g. [20]. Moreover Eq. (2)
reduces to Eq. (1) in the case of degenerate energy levels, as shown
in [13].
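The unsmoothed ($\epsilon = 0$) version of the relative-entropy expression is easy to evaluate for diagonal states. The sketch below assumes the commonly used form of the order-0 relative entropy, $D_0(\rho\|\sigma) = -\log_2 \mathrm{tr}(\Pi_\rho \sigma)$ with $\Pi_\rho$ the projector onto the support of $\rho$; the function names are hypothetical and the smoothing over nearby states is deliberately left out.

```python
import math

def thermal_state(energies, kT=1.0):
    """Gibbs probabilities exp(-E/kT)/Z for the given levels."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def relative_entropy_order0(p, q):
    """D_0(p||q) in bits: minus log2 of the q-weight on the support of p."""
    return -math.log2(sum(qi for pi, qi in zip(p, q) if pi > 0))

def work_unsmoothed(p, energies, kT=1.0):
    """kT ln(2) D_0(p || thermal state), i.e. the eps = 0 case."""
    return kT * math.log(2) * relative_entropy_order0(p, thermal_state(energies, kT))

# Four degenerate levels (two bits), with the state known to lie in two of them.
w_degenerate = work_unsmoothed([0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])
```

For these degenerate levels the result is $kT \ln 2$, the same value as $(n - H_{\max}) kT \ln 2$ with $n = 2$ and $H_{\max} = 1$, in line with the statement that the two expressions agree for degenerate energy levels.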



The work extraction game. — The game has simple but minimal rules.
(It will nevertheless not be trivial to analyse as there is a
multitude of different strategies one may choose.) There are three
systems and a work-extraction agent. One system is the working medium,
another is a heat bath of temperature T, and the last is the work
reservoir. The agent wishes to transfer as much energy as possible
into the work reservoir in a single extraction. We shall be concerned
with quantifying how much energy can be transferred with certainty up
to failure probability $\epsilon$, calling this the work,
$W^\epsilon$.

The initial energy spectrum $\{E\}$ of the working medium is
arbitrary. The initial density matrix $\rho$ of the system is diagonal
in the energy basis. The final energy spectrum $\{F\}$ and diagonal
density matrix $\sigma$ are also arbitrary. Whilst the initial and
final energy spectra are arbitrary it should be noted that
$W^\epsilon$ will turn out to depend on them, so they must be
specified in order to calculate $W^\epsilon$. The agent has a few
elementary processes it can combine in any way it chooses: (i) it may
couple the working medium to the heat bath. This has the effect of
changing the probabilities (not the energy levels) in such a way that
they approach, by some amount, the Gibbs thermal state for the given
energy spectrum (wherein $p(E_i) = \exp[-E_i/kT]/Z$); (ii) it may
change the energy levels (without altering the probabilities) by
external intervention, taking $\{E\}_j$ to $\{E\}_{j+1}$ where j
labels the time step. Here the energy must be accounted for by being
taken from or given to the work reservoir. In a given realisation the
system is in one, possibly unknown, energy eigenstate and only changes
to that particular eigenstate cost or yield work; (iii) it may
permute, or equivalently relabel, the eigenstates. The combination of
these elementary processes the agent chooses is called its strategy.
Energy conservation is assumed throughout; in particular any energy
leaving or entering the system must enter or leave the heat bath
and/or the work-extraction system.

Relative mixedness. — We now introduce a measure of how much more
mixed a state $\sigma$ is than another state $\rho$, calling this the
relative mixedness $M(\rho\|\sigma)$. The definition of
$M(\rho\|\sigma)$ will later be justified by operational statements we
will make.

Definition 1 (Relative mixedness). The relative mixedness of two
states $\rho$ and $\sigma$ with compact support and spectra $f(x_1)$,
$g(x_2)$ respectively is given by

$M(\rho\|\sigma) := \max\left\{ m : \int_0^{l} f(x_1)\,dx_1 \ge \int_0^{ml} g(x_2)\,dx_2 \;\; \forall l \right\}.$ (3)

For states with discrete spectra $\{\lambda_i\}$ one evaluates M for
the associated step-function where the i-th 'block' of constant height
has height $\lambda_i$ and all blocks have width 1.

The logarithmic relative mixedness $\ln M(\rho\|\sigma)$ (the ln will
turn out to make the interpretation neater) can be viewed as a measure
of how much the spectrum of $\rho$ majorises that of $\sigma$. If m is
set to 1 the expression optimised over in the definition of M reduces
to the definition for whether (the spectrum of) $\rho$ majorises
$\sigma$, $\rho \succ \sigma$. We shall discuss more relations between
M and majorisation later in the Letter.
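A minimal numerical sketch of Definition 1 for discrete spectra is given below. It is an illustration, not the authors' procedure: the function name is hypothetical, the blocks are assumed to be sorted by decreasing height (as is usual when comparing spectra by majorisation), and the maximisation uses the fact that both integrals are piecewise linear, so the constraint can only become tight at a kink of one of them.

```python
import numpy as np

def relative_mixedness(f, g, tol=1e-12):
    """Largest m with  int_0^l f >= int_0^(m*l) g  for every l > 0.

    f and g are discrete spectra, read as step functions made of
    unit-width blocks sorted by decreasing height (an assumption here).
    """
    f = np.sort(np.asarray(f, dtype=float))[::-1]
    g = np.sort(np.asarray(g, dtype=float))[::-1]
    f, g = f[f > 0], g[g > 0]
    # Breakpoints of the two piecewise-linear integrals F(l) and G(y).
    xf, F = np.arange(len(f) + 1.0), np.concatenate(([0.0], np.cumsum(f)))
    xg, G = np.arange(len(g) + 1.0), np.concatenate(([0.0], np.cumsum(g)))
    # Candidate l values: kinks of F, and points where F reaches a kink value of G.
    levels = G[1:][G[1:] <= F[-1] + tol]
    candidates = np.concatenate((xf[1:], np.interp(levels, F, xf)))
    F_at = np.interp(candidates, xf, F)
    binding = F_at <= G[-1] + tol      # once F exceeds G's total weight nothing binds
    # For each candidate l, the largest allowed m is G^{-1}(F(l)) / l.
    m = np.interp(F_at[binding], G, xg) / candidates[binding]
    return float(m.min())

m_example = relative_mixedness([0.5, 0.5, 0.0], [2/3, 1/6, 1/6])   # about 0.75
m_major = relative_mixedness([0.7, 0.2, 0.1], [0.4, 0.3, 0.3])     # about 1.0
```

A value of at least 1 means that m = 1 is admissible, i.e. the first spectrum majorises the second; a value below 1, as in the first example, means it does not.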

Gibbs-rescaling. — We now describe a powerful insight from [15-17]
which we employ here to bridge a particular gap between information
theory and statistical mechanics: the fact that the former does not
care about energy. In information theory, the Shannon/von Neumann
entropy of a state, $-\sum_i p_i \log p_i$, is independent of the
energies of the states involved. As the extractable work is expected
to depend on the energy levels involved it follows that it is not
expected to be uniquely determined by an entropy (or any other
quantity that is independent of energy). A key way in which energy
enters into statistical mechanics is that in a Gibbs state the
probability of any given energy eigenstate with energy E is given by
$p_T(E) = \exp(-E/kT)/Z$ where Z is the partition function. The
insight we adapt from [15-17] is that we can take this bias into
account by what essentially amounts to rescaling the density matrix's
eigenvalue distribution by $p_T(E)$. After this rescaling the
occupation probabilities will turn out to uniquely determine our
expression for the extractable work. More specifically, we shall be
applying an operation we term Gibbs-rescaling to the eigenvalue
spectrum. Firstly we transform the spectrum $\{\lambda_i\}$ into the
associated step-function. Then we take each block, rescale its height
as $\lambda_i \mapsto \lambda_i e^{E_i/kT}$, and its width as
$l = 1 \mapsto e^{-E_i/kT}$, such that the area of the new block is
$\lambda_i$ as before. We write this operation applied to a density
matrix $\rho$ as $G^T(\rho)$.
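The rescaling just described amounts to two array operations. The sketch below is a hedged illustration with a hypothetical function name; it represents the rescaled step function simply as a pair of arrays (block heights and block widths) and checks the two properties stated in the text for a Gibbs state: the rescaled distribution is flat, and each block keeps its area.

```python
import numpy as np

def gibbs_rescale(eigenvalues, energies, kT=1.0):
    """Return (heights, widths) of the Gibbs-rescaled step function.

    Block i gets width exp(-E_i/kT) and height lambda_i * exp(E_i/kT),
    so its area stays lambda_i.
    """
    lam = np.asarray(eigenvalues, dtype=float)
    widths = np.exp(-np.asarray(energies, dtype=float) / kT)
    heights = lam / widths
    return heights, widths

energies = np.array([0.0, 1.0, 2.0])           # hypothetical levels, in units of kT
Z = np.exp(-energies).sum()
gibbs = np.exp(-energies) / Z                  # thermal occupation probabilities
heights, widths = gibbs_rescale(gibbs, energies)
# heights are all 1/Z and the widths add up to Z: a uniform distribution on (0, Z].
```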

The main theorem. — Having defined the relative mixedness
$M(\cdot\|\cdot)$ and Gibbs-rescaling $G^T(\cdot)$ we are now ready to
give the main statement.

Theorem 1. In the work extraction game defined above, if one is given
an initial density matrix $\rho = \sum_i \lambda_i |e_i\rangle\langle e_i|$
and final density matrix $\sigma = \sum_j \nu_j |f_j\rangle\langle f_j|$
with $\{|e_i\rangle\}$, $\{|f_j\rangle\}$ the respective energy
eigenstates and both $\rho$ and $\sigma$ having finite rank, then the
work one can extract with certainty except with $\epsilon$ probability
respects

$W^\epsilon \le kT \ln M\!\left( \frac{G^T(\rho)}{1-\epsilon} \,\Big\|\, G^T(\sigma) \right).$



The proof of the statement is in the technical supple- 
ment. A rough intuition is that the Gibbs-rescaling is 
needed to take into account the bias imposed on the en- 
ergy levels by the Gibbs statistics, and up to that bias, 
only the amount of majorisation matters as the work ex- 
traction process lowers the majorisation amount, the rel- 
ative mixedness. 

Figure 1 gives an example of a simple application of 
the theorem. 
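The bound of Theorem 1 can be evaluated numerically for diagonal states. The following Python sketch is an illustration under stated assumptions rather than the authors' method: all function names are hypothetical, the Gibbs-rescaled blocks are sorted by decreasing height before integrating, and the maximisation over m exploits that both integrals are piecewise linear.

```python
import numpy as np

def gibbs_rescale(eigenvalues, energies, kT=1.0):
    """Heights lambda_i * exp(E_i/kT) and widths exp(-E_i/kT) of the rescaled blocks."""
    widths = np.exp(-np.asarray(energies, dtype=float) / kT)
    return np.asarray(eigenvalues, dtype=float) / widths, widths

def _integral(heights, widths):
    """Breakpoints (x, area up to x) for blocks sorted by decreasing height."""
    keep = heights > 0
    h, w = heights[keep], widths[keep]
    order = np.argsort(-h)
    h, w = h[order], w[order]
    return (np.concatenate(([0.0], np.cumsum(w))),
            np.concatenate(([0.0], np.cumsum(h * w))))

def relative_mixedness(f, g, tol=1e-12):
    """Largest m with int_0^l f >= int_0^(m*l) g for all l; f, g are (heights, widths)."""
    xf, F = _integral(*f)
    xg, G = _integral(*g)
    levels = G[1:][G[1:] <= F[-1] + tol]
    candidates = np.concatenate((xf[1:], np.interp(levels, F, xf)))
    F_at = np.interp(candidates, xf, F)
    binding = F_at <= G[-1] + tol
    return float(np.min(np.interp(F_at[binding], G, xg) / candidates[binding]))

def work_bound(rho, E, sigma, F_levels, eps=0.0, kT=1.0):
    """kT ln M( G(rho)/(1-eps) || G(sigma) ) for diagonal rho and sigma."""
    hp, wp = gibbs_rescale(rho, E, kT)
    hq, wq = gibbs_rescale(sigma, F_levels, kT)
    return kT * np.log(relative_mixedness((hp / (1.0 - eps), wp), (hq, wq)))

# Erasing one unknown bit (two degenerate levels) into a single definite level.
erase_certain = work_bound([0.5, 0.5], [0.0, 0.0], [1.0], [0.0])
erase_risky = work_bound([0.5, 0.5], [0.0, 0.0], [1.0], [0.0], eps=0.2)
```

With kT = 1 the first value is $-\ln 2$ and the second is $-\ln 2 + \ln(1/0.8)$, so both are negative: work has to be invested, less so when a failure probability is tolerated.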

We now discuss the implications of the statement and 
develop the argumentation further. Firstly we consider 
how to generalise the laws of thermodynamics for them 
to be defined and correct in our model. 
Figure 1: Case of initial and final states being Gibbs states. The
Gibbs-rescaling takes a Gibbs state with partition function $Z_p$
($Z_q$) to a uniform distribution $p$ ($q$) of width $Z_p$ ($Z_q$) and
height $1/Z_p$ ($1/Z_q$) (upper graph). The integrals $\int p$ and
$\int q$ (lower graph) are used to evaluate the relative mixedness.
Consider firstly $\epsilon = 0$. The m defined in the relative
mixedness definition must satisfy $\int_0^{l} p \ge \int_0^{ml} q$
$\forall l$, and we see this holds for $m \le Z_q/Z_p$, implying that
$W^0 \le kT \ln\frac{Z_q}{Z_p}$. For the case of $\epsilon \neq 0$,
$\int p$ is replaced with $\frac{1}{1-\epsilon}\int p$ and one sees
$m \le \frac{Z_q}{Z_p (1-\epsilon)}$. Thus in this case
$W^\epsilon \le kT \ln\frac{Z_q}{Z_p} + kT \ln\frac{1}{1-\epsilon}$. A
special case of the above corresponds to a generalisation of
Landauer's erasure principle, as we can take the initial state to be
two thermalised degenerate levels at energy 0 (i.e. one unknown bit)
and the final state to be thermalised but with one energy level at 0
and the other extremely high (i.e. a bit taking one value only). Then
we see $W^\epsilon \le -kT \ln 2 + kT \ln\frac{1}{1-\epsilon}$, so
that work would have to be invested if $\epsilon$ is suitably small.
(The reverse direction is also possible, corresponding to a
single-qubit Szilard engine.)
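For Gibbs states the bound in the caption only involves the two partition functions, so it can be checked with a few lines of Python. The function names and the choice of 40 kT for the "extremely high" level are assumptions made for this illustration.

```python
import math

def partition_function(energies, kT=1.0):
    return sum(math.exp(-e / kT) for e in energies)

def gibbs_to_gibbs_bound(E_initial, E_final, eps=0.0, kT=1.0):
    """kT ln(Z_q/Z_p) + kT ln(1/(1-eps)) for thermal initial and final states."""
    z_p = partition_function(E_initial, kT)
    z_q = partition_function(E_final, kT)
    return kT * math.log(z_q / z_p) + kT * math.log(1.0 / (1.0 - eps))

# Landauer-type erasure: two degenerate levels -> one level at 0, one very high.
erasure = gibbs_to_gibbs_bound([0.0, 0.0], [0.0, 40.0])
erasure_coin_flip = gibbs_to_gibbs_bound([0.0, 0.0], [0.0, 40.0], eps=0.5)
```

With kT = 1 the first bound is $-\ln 2$ to numerical precision, while accepting a failure probability of 1/2 brings the bound up to zero.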



0th and 1st laws of thermodynamics. — The 0th law can be stated as:
There exists for every thermodynamic system in equilibrium a property
called temperature. Equality of temperature is a necessary and
sufficient condition for thermal equilibrium. This also holds after
our generalisation. In particular we are still assuming heat baths
that take the working medium closer to a Gibbs thermal state upon
interaction.

The first law however is more subtle as it involves distinguishing
between heat and work, and we do generalise the definition of work and
accordingly that of heat. The law is normally stated as
$dU = dQ - dW$ where $U = \mathrm{tr}(\rho H)$ is the internal energy
of the working medium with Hamiltonian H, Q is 'heat' and W 'work'.
The essential idea behind the splitting of the internal energy into
two terms in such a manner is that one imagines three systems: a heat
bath, a working medium, and an energy storing system (e.g. a weight
that is lifted or an atom that is excited). Energy transfers to/from
the working medium from the heat bath are called dQ and those to/from
the storage system dW. Our model preserves this distinction of three
systems and by assumption conservation of energy holds in any given
extraction. However we are going beyond talking about the average
internal energy of the working medium and make statements about how
this energy changes in individual extractions. Our generalisation of
the first law can be stated as

In any given extraction, with $p \ge 1 - \epsilon$,

$dE_{sys} = -dE_{bath} - dW^\epsilon - dE_{rest} = dQ - dW^\epsilon.$ (4)

Note that we are breaking the energy change of the work reservoir into
two parts, $dW^\epsilon$ and $dE_{rest}$. The idea behind this is that
only predicted energy transfer should count as work. One may for
example imagine buckets lifting water out of a mine up to a certain
height (or as a quantum example an atom excited up to a certain
level). The height at which the buckets are tipped into a reservoir is
specified in advance. If they go higher than this, the extra potential
energy will be transferred to other degrees of freedom associated with
the reservoir system, e.g. into movement of the water. We call this
extra energy $dE_{rest}$ wasted and group it with the generalised
heat, $-dQ = dE_{bath} + dE_{rest}$. This second term is not present
in the von Neumann regime (asymptotic i.i.d. with vanishing
$\epsilon$) as there the work extraction is deterministic.

Second law. — Consider the so-called Kelvin state- 
ment of the second law: 

No process is possible in which the sole re- 
sult is the absorption of heat from a reservoir 
and its complete conversion into work. 

This law, or more specifically a natural quantitative generalisation
of this law, holds also in our more general setting. We show in the
appendix that for given states of the working medium A and B
respectively,
$W^\epsilon(A \to B) + W^\epsilon(B \to A) \le W^{2\epsilon}(A \to A)$.
We call this the 'triangle inequality'. It implies the following
generalisation of the second law.

$\sum_{i=0}^{m-1} W^\epsilon(A_i \to A_{i+1}) \le W^{m\epsilon}(A_0 \to A_0)$ if $A_m = A_0$. (5)

If $\epsilon = 0$, which would be the case if one wants the system to
be reset at the end with certainty, then the right-hand side of the
inequality is 0. Note that one may still get work out in a single
cycle by letting $\epsilon$ depend on i. One could e.g. take a larger
risk of failure when resetting compared with extracting and then get
net work out from a cycle if successful.

The second law is also closely related to entropy increasing with time
and one may wonder what the corresponding generalisation of the
statement is. A particular standard expression is that

$\Delta(S - \beta\langle E\rangle) \ge 0$ (6)

where $S$ and $\langle E\rangle$ are the von Neumann entropy and
expected energy of a system interacting with a heat-bath with inverse
temperature $\beta$. ($\Delta$ indicates the change in these values
during the interaction.) This actually still holds in our more general
model; we show this in the technical supplement. However, crucially,
it is not sufficient to guarantee that an evolution is possible.
Instead it should be replaced by the statement that a state change
$\rho \to \rho'$ due to a thermalisation with a heat-bath at
temperature T is possible if and only if

$\ln M(G^T(\rho)\|G^T(\rho')) \ge 0.$ (7)

This is a significant strengthening of the restriction on second
law-type entropy increase. There are processes that respect Eq. (6)
but violate Eq. (7). A simple example is to consider degenerate energy
levels so that $\Delta\langle E\rangle = 0$, and take three levels
with probabilities $(1/2, 1/2, 0)^T \to (2/3, 1/6, 1/6)^T$. Then
$\Delta S \approx 0.25$ bits but $\ln M(G^T(\rho)\|G^T(\rho'))$ is
negative so this evolution is not possible according to our model. The
inequivalence of entropy and majorisation displayed here has been
previously noted in the context of the second law [15, 16]. The reason
this has not received more attention to date is presumably that in the
von Neumann regime this inequivalence disappears. More precisely, if
we take a tensor product of n identical states each with von Neumann
entropy S and let $n \to \infty$, then with asymptotically small error
we may approximate the distribution over states as a uniform
distribution with value $2^{-nS}$ in the range $[0, 2^{nS}]$.
Increasing S now gives a flatter and wider uniform distribution,
majorised by the initial distribution.
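The three-level example can be verified with a short Python check. For degenerate levels the Gibbs-rescaling leaves all blocks with equal width, and since the constraint defining M only gets harder as m grows, $\ln M \ge 0$ is equivalent to m = 1 being admissible, i.e. to the initial spectrum majorising the final one. The sketch therefore compares partial sums directly; the function names are hypothetical.

```python
import math

def entropy_bits(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def majorises(p, q, tol=1e-12):
    """True if every partial sum of sorted p is at least that of sorted q."""
    cp = cq = 0.0
    for a, b in zip(sorted(p, reverse=True), sorted(q, reverse=True)):
        cp, cq = cp + a, cq + b
        if cp < cq - tol:
            return False
    return True

rho = [1/2, 1/2, 0.0]
rho_prime = [2/3, 1/6, 1/6]
delta_S = entropy_bits(rho_prime) - entropy_bits(rho)   # entropy goes up
allowed = majorises(rho, rho_prime)                     # but majorisation fails
```

The entropy increases by roughly a quarter of a bit, yet the first partial sum already violates majorisation (1/2 < 2/3), so the transition passes the entropic test while failing the stricter condition.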

The fact that Eq. (7) can be experimentally violated without violating
the standard expression implies an interesting potential test of our
theory. More specifically, in our theoretical analysis Eq. (7) follows
as a mathematical consequence, so the experimental question concerns
whether our model is appropriate to describe physical systems
interacting with their environment.

Recovering existing results. — Eq. (2) and accordingly Eq. (1) are
special cases of our main result. Eq. (2) corresponds to the case
where the final state $\rho_T$ is demanded to have the same
eigenspectrum and be a Gibbs state
($\rho_T = \sum_i p_T(E_i)|e_i\rangle\langle e_i|$). We show in the
technical supplement that Eq. (2) is indeed recovered in that limit.
(It was also shown in Fig. 1 that Landauer's erasure principle is
recovered in a simple manner.)

Outlook. — It is particularly important to test the results
experimentally in non-equilibrium quantum systems, as outlined above.
On the theoretical side one should also consider the maximal
risk-taking type of strategies of [5] and the case of quantum side
knowledge as in [12]. Finally we note the striking similarities
between what is discussed above and the question of quantifying
entanglement in the non-asymptotic regime. It was shown in a seminal
paper by Nielsen [21] that majorisation is the central quantity there
and we anticipate that many of the qualitative and quantitative
results from our work can be applied also in that context.
I. TECHNICAL SUPPLEMENT 

We first give certain definitions and lemmas needed to 
prove the main theorem. 

Definition 2 (Gibbs-rescaling). Consider a density matrix
$\rho = \sum_i \lambda_i |e_i\rangle\langle e_i|$ with eigenvalues
$\{\lambda_i\}_{i=1}^n$ and take the energy eigenstates of the system
to be $\{|e_i\rangle\}_{i=1}^n$ with energies $\{E_i\}_{i=1}^n$
respectively. There is an associated step function for the spectrum,
$\lambda(xn) = \lambda_{\lceil xn \rceil}$ where $x \in (0, 1]$.
Similarly there is an energy step function
$E(xn) = E_{\lceil xn \rceil}$ where $x \in (0, 1]$. The
Gibbs-rescaling associated with temperature T combines $\lambda(xn)$
and $E(xn)$ to a new function $G^T(y)$ given by

$G^T\!\left( \int_0^{xn} e^{-E(x')/kT}\, dx' \right) := \lambda(xn)\, e^{E(xn)/kT}.$

It follows that $G^T(y)$ is defined on $(0, Z]$, with
$Z = \sum_{i=1}^n \exp(-E_i/kT)$ the partition function. Moreover
$G^T(y)$ is a probability distribution satisfying
$\int_0^Z G^T(y)\,dy = 1$.

Definition 3 (Notation).
$s \in \{0, 1\}^m$: path variable for a game with m work extractions
(subsequently called "steps"): $s_j = 1$: system is in chosen states
for work extraction; $s_j = 0$: system is not in chosen states for
work extraction.
$\bar{s}_j$ is the complement of $s_j$:
$\bar{s}_j = 1 \Leftrightarrow s_j = 0$ and
$\bar{s}_j = 0 \Leftrightarrow s_j = 1$.
$w_s^j$: logarithmical work ($kT \ln(w_s^j) = W_s^j$) one extracted in
the step j on the path s.
$w^j$: the logarithmical work one tries to get out in the step j.
$w$: total logarithmical work demanded in order to call the total
extraction successful.
$\eta_s^j$: probability of doing step j on the path s.
$P_s$: total probability of success:
$P_s = \sum_{s \in G} \prod_j \eta_s^j$.
$\phi_s^j$: state of the system after step j if the previous evolution
of the system is given by the path s.
$\lambda_s^j$: eigenvalues of state $\phi_s^j$.
$p_s^j = G(\phi_s^j)$: Gibbs-rescaled probability distribution after
step j (before thermalizing) conditioned on the previous steps on
path s.
$p'^{\,j}_s$: Gibbs-rescaled probability distribution after step j
(after thermalizing) conditioned on the previous steps on path s.
$q$: final Gibbs-rescaled probability distribution, conditioned on
successful work extraction:
$q = \frac{1}{P_s} \sum_{s \in G} \Big( \prod_j \eta_s^j \Big)\, p'^{\,m}_s$.
$B_j$: bistochastic matrix one chooses after step j by thermalizing
the system (this has to be the same for all paths).
$E^j(x)$: energy of the level labeled by x after step j.
$\Theta_U(x)$: step function on U: $\Theta_U(x) = 1$ for $x \in U$,
and 0 else.

Definition 4 (Block). For $a < b$ the interval $(a, b]$ is said to be
a block corresponding to a level k, if $p_s^j$ is constant on this
interval $\forall s$.

Definition 5 (Thermalisation). If after step j one chooses to do a
thermalisation one gets no change in the work reservoir and the
Gibbs-rescaled probability afterwards is given by:

$p'^{\,j}_s = B_j(p_s^j)$ (4)

where $B_j(\cdot)$ is associated with a bistochastic matrix $B_j$,
acting on the vector $p_k = p_s^j(a_k)$, where $(a_{k-1}, a_k]$ with
$a_{k-1} < a_k$ are blocks corresponding to levels k with
$a_k - a_{k-1} = a_l - a_{l-1}$ $\forall k, l$. $p'^{\,j}_s$ is then
defined by $p'^{\,j}_s(x) = (B_j p)_k$ for $x \in (a_{k-1}, a_k]$.
$B_j$ does not depend on the path s.

Note that a permutation matrix is bistochastic so one may permute the
levels in the thermalisation step. We use this to simplify the
notation of the work extraction definition. We take the levels lowered
or raised to form the first l levels.

Definition 6 (Work extraction). To do a work extraction in step j one
first defines an interval
$(0, a] = \bigcup_{k=1}^{l} (a_{k-1}, a_k]$, where $(a_{k-1}, a_k]$
correspond to levels $\{1, \ldots, l\}$, on which one wants to change
the energy by $\Delta E = -kT \ln(w^j)$. The remaining levels remain
untouched.

For $x \in (0, 1]$, the eigenvalues of the levels after step j,
conditioned on the previous state, are given by:

In the case $s_j = 1$ (state of the system is found to be in the
levels corresponding to $(0, a]$):

$\lambda^j_{s|s_j=1}(xn) = \Theta_{(0,a]}(Z_{j-1}x)\, \frac{\lambda'^{\,j-1}_s(xn)}{\eta^j_{s|s_j=1}}$,
where $\eta^j_{s|s_j=1} = \int_0^a p'^{\,j-1}_s(x)\,dx$.

In the case $s_j = 0$ (state of the system is not in the levels
corresponding to $(0, a]$):

$\lambda^j_{s|s_j=0}(xn) = \Theta_{(a,Z_{j-1}]}(Z_{j-1}x)\, \frac{\lambda'^{\,j-1}_s(xn)}{\eta^j_{s|s_j=0}}$,
where $\eta^j_{s|s_j=0} = \int_a^{Z_{j-1}} p'^{\,j-1}_s(x)\,dx$. In
this case one has no change in the work reservoir:
$w^j_{s|s_j=0} = 1$.
=0 
As mentioned in the main body, our work extraction game is essentially
to combine the elementary processes of thermalisation and work
extraction in any sequence in order to extract as much work as
possible (or input as little as possible) whilst going from a defined
initial state to a defined final state.

Definition 7 (Our work extraction game). There are three systems and a
work-extraction agent. One system is the working medium, another is a
heat bath of temperature T, and the last is the work reservoir.

The initial energy spectrum $\{E\}$ of the working medium is arbitrary
but given. The initial density matrix $\rho$ of the same is diagonal
in the energy basis. The final energy spectrum $\{F\}$ and diagonal
density matrix $\sigma$ are also arbitrary but given.

The agent can combine thermalisation (defined above) and work
extraction (also defined above) in any sequence. This sequence,
together with the specifications for each step, is called its
strategy.

A given work extraction will transfer some energy v to the work
reservoir. Before the extraction the agent must specify W. If
$v \ge W$ the work extraction is termed successful. The probability of
success is called $1 - \epsilon$.

We shall be interested in bounding W given $\epsilon$ and the initial
and final conditions. We break the calculation into several lemmas
which will later be combined to prove the main theorem.

The following lemma is relevant for quantifying how much flatter the
Gibbs-rescaled distribution is after a work extraction.

Lemma 2. For $x \in (0, Z_j]$ (with $Z_j$ the partition function after
step j), by doing a work extraction in step j, the Gibbs-rescaled
probability distributions after the step, conditioned on the previous
steps on path s, are given by:

In the case $s_j = 1$:

$p^j_{s|s_j=1}(x) = \frac{\Theta_{(0,aw^j]}(x)}{w^j\, \eta^j_{s|s_j=1}}\; p'^{\,j-1}_s\!\left(\frac{x}{w^j}\right).$

In the case $s_j = 0$:

$p^j_{s|s_j=0}(x) = \frac{\Theta_{(aw^j,Z_j]}(x)}{\eta^j_{s|s_j=0}}\; p'^{\,j-1}_s(x - aw^j + a).$

Proof. Case $s_j = 1$: Let $w = w_s^j$, $p^j = p_s^j$,
$p^{j-1} = p'^{\,j-1}_s$, $\lambda^j = \lambda_s^j$ and
$\eta^j = \eta_s^j$. Let $x \in (0, aw] \cap (0, Z_j]$. Lowering the
energy of a level by $kT \ln w$ multiplies the width $e^{-E/kT}$ of
its block by w, and conditioning on $s_j = 1$ divides its eigenvalue
by $\eta^j$, so the block that sat at $x/w$ now sits at x with height
$p^{j-1}(x/w)/(w \eta^j)$. One easily sees that $p^j(x) = 0$ for
$x > aw$. The proof for the case $s_j = 0$ is analogous. □

The next lemma is in contrast concerned with how much wider the
Gibbs-rescaled distribution is after the work extraction (recall that
the distribution has support on $(0, Z]$ where Z is the partition
function).

Lemma 3. The partition function $Z_j$ immediately after step j is
given by:

$Z_j = Z_{j-1} + aw^j - a$

Proof. Let $(0, a]$ be an interval consisting of blocks corresponding
to the levels $\{1, \ldots, l\}$ and n such that one can split the
interval $(a, Z_{j-1}]$ into $n - l$ blocks. Then

$Z_j = \sum_{k=1}^{n} e^{-E_k^j/kT} = \sum_{k=1}^{l} e^{-E_k^j/kT} + \sum_{k=l+1}^{n} e^{-E_k^j/kT}$
$= w^j \sum_{k=1}^{l} e^{-E_k^{j-1}/kT} + \sum_{k=l+1}^{n} e^{-E_k^{j-1}/kT}$
$= w^j a - a + \sum_{k=1}^{l} e^{-E_k^{j-1}/kT} + \sum_{k=l+1}^{n} e^{-E_k^{j-1}/kT}$
$= Z_{j-1} + aw^j - a$

out of which the lemma follows. □
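The statement of Lemma 3 is easy to confirm numerically. The sketch below uses hypothetical energy levels and a hypothetical value of the logarithmical work w; it lowers the first l levels by $kT \ln w$ and compares the partition functions before and after.

```python
import math

def partition_function(energies, kT=1.0):
    return sum(math.exp(-e / kT) for e in energies)

kT = 1.0
E_prev = [0.0, 0.5, 1.0, 2.0]     # hypothetical levels before step j
l, w = 2, 1.7                     # lower the first l levels by kT ln(w)
a = partition_function(E_prev[:l], kT)   # width of the interval (0, a]
E_next = [e - kT * math.log(w) for e in E_prev[:l]] + E_prev[l:]
Z_prev = partition_function(E_prev, kT)
Z_next = partition_function(E_next, kT)
# Lemma 3 predicts Z_next == Z_prev + a*w - a.
```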



We now combine the two previous lemmas to gain another relation
between the Gibbs-rescaled distributions at steps j and j − 1. We
shall use this later in an iterative manner to relate the very first
and final Gibbs-rescaled distributions.

Lemma 4. The Gibbs-rescaled probability distributions at steps j and
j − 1 respectively satisfy the relation



}sj=k 



/c=0.1 



with constants cL, 



and c 



s\sj —0 



aw-' — a. 



Proof. Let Wk - w^^^_^^, pi = p'g^^^^^, p' ^ 



Vs,t ' 



> 



se{o,i}J ^ 



j+i j+i j+i. 



where is the permutation, which maximizes the left 
hand side, while t^^^ is the one which maximizes the 
right hand side. 

n J- T -1- 7 + 1 7 + 1 7 + 1 7 + 1 

Proof Let pi = pl ^, po = P^So,t^ Vi = vi , Va = , 



w = w- 



J+1 



s6{0,l}J 

= / {rjiwri o pi[xw) + rjfiTi o pq[x + aw — a)^ A X 

h 

y^ j rjiwf o pi[xw) 



m = ^rs\s,^k 

Let Co = awi — a and ci = and c||^ ^^ = Cj. Then: 

Vo'WoPoi^Wn + Co) + T]iwip{{xwi + ci) 
= VoPoix + awi — a) + ?7iWipj(xu'i) 

= Q(awi,Zj]{x + awi - aV^^(a;) + Q[Q.awi]{xwi)p'~'^ {x) 

= Q(a,Zi-awi+a]{x)p'^^ [x) + 0(o,a] (a: (x) 



□ 



Lemma 5 (Induction step). Let j e {!,..., to}. 
Let ; e (0,^j]. Let ^ £ {0, f De- 



Se{Q.i}i 

-I- / ?7oT o po(a; + aw — a) dx 

s-6{o,ip i 



Where f reorders € {0, l}-'pi in descending order in 
(0, aw] and G {0, l}^po in (aw, Zj]. This is possible 
since pi and po have disjoint support, also for different s, 
since a in definition |6] has to be chosen indepentantly of 
the path. (See lemma [2|. 

li S (0,min(a, Z)] is a value which maximizes the right 
hand side of the last line. Using the same argument back- 
wards after changing variables, we get: 



se{o,i}j 

li l+aw — lx 

= ^ j rjiwf o pi(xw) + ^ J riofopQ(x)dx 



se{o,i}j 



s6{0,l}3 



y^ J {^rjiWT^^^ o pi(xw) + rjoT^^^ o pq[x)) dx 



s-e{o,i}3 



/iTie Si — 
So = (si, . . . ,Sj,0,si, 



(si, . . . , , 1, s'j^, . . . , s^j_ -^j^) BTic? Applying any bistochastic matrix B on the probabilities 



Then: 



Pq and pi and reordering in descending order with r, 



j+i 
afterwards, we get: 



afterwards). Inductively using lemma [5] one gets: 



p{x) dx 



sG{0,l}i 
/ 

> j tI + ^ o B o (t'+^)-^ 



p%{x)dx 







se{o,ip 

/ (^wrjiT^^^ o B o pi[xw) 
se{o.iV { 

+rioT^^^ o B o pQ(x)j dx 



?G{0,1}'" ^ \j = l / 



se{o,i}J 



where the inequality follows from the relation $p \succ Bp$ for any
bistochastic matrix B and vector p, which is proved in [22]. □



Iw 

= Ps j q[x)dx 



Therefore (with 1 — e): 

W = fcrin(u;) 



< kT In max ■ 



= fcTln 



m 


/ 







G(p) 


1 - 


- e 



;?(xi)dxi > / (1 — e) q{x2)dx2 V/ 



which proves the main theorem.

Theorem (Main, called Theorem [1] in main body). In the work extraction game defined above, if one is given an initial density matrix $\rho$ and a final density matrix $\sigma$ that are diagonal in their respective energy eigenbases $\{|e_i\rangle\}_i$ and $\{|f_j\rangle\}_j$, and both $\rho$ and $\sigma$ have finite rank, then the work one can extract with certainty except with probability $\epsilon$ respects



□ 



Proof Define p%, = p. W.l.o.g. s' = {0, . . . , 0} (the 
first probability distribution is independent of the path 
II. RECOVERING THE RELATIVE MIN-ENTROPY



that it is also e-near to p: 



We now recover the result of Eq. [2] which, as discussed in the main body, was given in [13, 14]. For completeness we repeat the equation here:

W = kT\n{2)D'o{cT\\pT). 

To do this we first need the corresponding expression for the Gibbs-rescaled entropies.

In the case where one starts in any state $P$ on a system with only degenerate energy levels and ends up in the thermal state $I = (2^{-n}, \ldots, 2^{-n})$, we would expect the maximal work that can be extracted by taking a risk $\epsilon$ to be $kT \ln(2)\,(n - H_{\max}^{\epsilon}(P))$, as seen in [F for the case of Szilard engines. Indeed this is the result. More generally:

Theorem 6.

\[ W = kT \ln(2)\,\bigl(H_{\max}(q) - H_{\max}^{\epsilon}(p)\bigr) \]

where $p = G^T(\rho)$ is the Gibbs-rescaled probability distribution corresponding to the initial state $\rho$ and $q = G^T(\sigma)$ is the one corresponding to the final state $\sigma$.
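As an illustration of the degenerate special case mentioned above, the following minimal sketch evaluates $kT \ln(2)\,(n - H_{\max}^{\epsilon}(P))$ for a discrete distribution over $2^n$ degenerate levels. It assumes that the smooth max-entropy of a discrete distribution is $\log_2$ of the smallest number of outcomes carrying at least $1 - \epsilon$ of the probability, in analogy with the continuous lemma below; the function names and the example distribution are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def smooth_max_entropy(probs, eps):
    """log2 of the smallest number of outcomes holding at least 1 - eps of the weight."""
    total = 0.0
    for count, prob in enumerate(sorted(probs, reverse=True), start=1):
        total += prob
        if total >= 1.0 - eps - 1e-12:  # small tolerance for float rounding
            return math.log2(count)
    return math.log2(len(probs))


def extractable_work(probs, eps, temperature):
    """kT ln(2) (n - H_max^eps(P)) for len(probs) = 2^n degenerate levels."""
    n = math.log2(len(probs))
    return K_BOLTZMANN * temperature * math.log(2) * (n - smooth_max_entropy(probs, eps))


probs = [0.9, 0.05, 0.03, 0.02]  # two-bit system, strongly biased
work_no_risk = extractable_work(probs, 0.0, 300.0)
work_with_risk = extractable_work(probs, 0.1, 300.0)
print(work_no_risk, work_with_risk)
```

With no accepted risk the full support counts and no work is extracted in this example, whereas accepting a failure probability of 0.1 lets one treat the state as if it occupied a single level.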

For the proof of this theorem a technical lemma on the 
smooth max-entropy is needed. 

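Lemma 7. Let $p$ be a monotonically decreasing probability function on $[0, \infty)$ and let $d_\epsilon$ be defined through

\[ \int_0^{d_\epsilon} \frac{p(x)}{1-\epsilon}\, dx = 1. \]

Then

\[ 2^{H_{\max}^{\epsilon}(p)} = d_\epsilon. \]

Proof. Let $d_\epsilon$ be defined as above. We need to show two things:

i) There exists a probability function $p^\epsilon$ on $[0, \infty)$ with $\|\mathrm{supp}(p^\epsilon)\| = d_\epsilon$ and trace distance $\delta(p, p^\epsilon) \le \epsilon$.

ii) $\|\mathrm{supp}(p^\epsilon)\| \ge d_\epsilon$ for all monotonically decreasing probability functions $p^\epsilon$ on $[0, \infty)$ with $\delta(p, p^\epsilon) \le \epsilon$.

Then we get that $H_{\max}^{\epsilon}(p) = \log_2\bigl(\min_{\delta(p,p') \le \epsilon} \|\mathrm{supp}(p')\|\bigr) = \log_2(d_\epsilon)$, as stated in the lemma. The proof of i) goes as follows: Define $p^\epsilon(x) = p(x) \bigl(\int_0^{d_\epsilon} p(x')\,dx'\bigr)^{-1}$ for $x < d_\epsilon$ and $p^\epsilon(x) = 0$ for $x > d_\epsilon$. This $p^\epsilon$ is therefore normalized to one, has support $[0, d_\epsilon]$, and the following computation shows that it is also $\epsilon$-near to $p$:

\[
\begin{aligned}
\delta(p, p^\epsilon) &= \frac{1}{2} \int_0^{\infty} |p^\epsilon(x) - p(x)|\, dx \\
&= \frac{1}{2} \left( \int_0^{d_\epsilon} |p^\epsilon(x) - p(x)|\, dx + \int_{d_\epsilon}^{\infty} p(x)\, dx \right) \\
&= \frac{1}{2} \left( \int_0^{d_\epsilon} \bigl(p^\epsilon(x) - p(x)\bigr)\, dx + \int_{d_\epsilon}^{\infty} p(x)\, dx \right) \\
&= \frac{1}{2} \left( 1 - \int_0^{d_\epsilon} p(x)\, dx + \int_{d_\epsilon}^{\infty} p(x)\, dx \right) \\
&= \int_{d_\epsilon}^{\infty} p(x)\, dx \le \epsilon,
\end{aligned}
\]

which concludes the proof of i).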



For the proof of ii) assume that there exists a $p^\epsilon$ as above such that $\|\mathrm{supp}(p^\epsilon)\| < d_\epsilon$; then (with 6 = 9o,,



\p'^{x) — p{x) \ dx 



Ip"^ (x) — p{x)\ d X + / Ip"^ {x) — p{x)\ d X 

J " V- ' 

p(x) 



extraction reads (using the lemma) 
dx > — dx 



1 - e 



I 



supp 



d, J 1 - e 





I supp(g)|| II supp(p')|| 



> 



e+ / e{p{x)~p'{x)){p{x)-p'{x))dx 
Q {p'^{x) — p{x)) {p'^{x) — p{xy)dx 



,0 



> 



1 



e + J e{p{x)~p''{x)){p{x)~p''{x))dx 



OO OO 

-J- J Q{p{x)-p\x)) 

rf, 

{P^ix) -p{x))dx\ 



1 

> 

- 2 



e + / Q{p{x)-p^{x)){p{x)-p''{x))dx 



>o 



q{x) dx 



The above is an equality in the case $l = d_\epsilon$, which shows that the maximal $w$ as defined in Theorem [1] is given by

\[ w = \frac{\|\mathrm{supp}(q)\|}{\|\mathrm{supp}(p^\epsilon)\|} = 2^{H_{\max}(q) - H_{\max}^{\epsilon}(p)}. \]

□

Equation [2] can then be seen as a corollary of this theorem:

Corollary. Eq. [2] is a special case of the above theorem, recovered when the final state is a Gibbs state and also has the same energy eigenvalues as the initial state:

W = kT\n{2)D'Q{p,a^) 

Proof. Let $p$ be the Gibbs-rescaled probability function corresponding to $\rho$ and $P(j)$ the eigenvalues of $\rho$. Let $a$ be the flat energy probability function corresponding to $\sigma$. Let $A(j) = \frac{1}{Z} e^{-\beta E(j)}$, where $E(j)$ are the energy eigenvalues of $\rho$ and $\sigma$, and $Z$ is the corresponding partition function. This means by definition that



(1-1)+ / Q[p[x)-p\x)){p{x)-p\x))dx 




p\Zj A 





y ■ n 
A 



dy 



A{\^])Z 



and likewise $a(x) = 1/Z$ (both defined for $x \in [0, Z]$). From the above theorem we get:

$W = kT \ln(2)\,\bigl(H_{\max}(a) - H_{\max}^{\epsilon}(p)\bigr)$

= fcrin(2) flog2(Z)-log2 f inf supp (p) 



= -fcTln(2)log2 



> e 



which is a contradiction to S{p,p'^) 



Uj\p^{x)^p{x)\)< 



□ 



mm 

l^x\fp(\yn-])dv>l~(.^ 

= kT\n{2)Rl{P,A) 



Z J Ai\yn])dy 



□ 



Now we have all we need to prove the theorem above:

Proof. Let $p^\epsilon$ be a probability function with the smallest possible support such that $\delta(p, p^\epsilon) \le \epsilon$ and define $d_\epsilon$ as in Lemma [7]. For $l \le d_\epsilon$ the requirement for maximal work
III. ENTROPY INCREASE LAW

We now show that our thermalisation model indeed respects $\Delta S \ge \beta \Delta\langle E\rangle$, where $S$ is the von Neumann entropy, $\beta$ the inverse temperature associated with the reservoir, and $\langle E\rangle = \sum_i \lambda_i E_i$ the expected internal energy.

Lemma 8. In the model for thermalisation used here the following is always respected:

\[ \Delta S \ge \beta \Delta\langle E\rangle. \qquad (8) \]



Proof. We first recall the model and define some notation.

Recall that the thermalisation model states that when two levels, 1 and 2, are coupled to the heat bath, their ratio $\lambda_1/\lambda_2$ gets closer to $\exp(-\beta(E_1 - E_2))$, and the other $\lambda$'s are untouched. In our model one may concatenate several such interactions to implement any allowed multi-level interaction with the bath. It will therefore suffice to show that Eq. [8] holds for a single two-level interaction with the heat bath.

For notational convenience let the probability of being in level 1 or 2 be called $\lambda_{12} := \lambda_1 + \lambda_2$. This is then constant for the given two-level interaction with the bath. In the extreme case of the two levels interacting with the bath for an arbitrarily long time we have $\lambda_1 := \lambda_1^T$ and $\lambda_2 := \lambda_2^T$ ($T$ reminds us of the temperature dependence). These values must then obey the relation

\[ \lambda_1^T / \lambda_2^T = \exp(-\beta(E_1 - E_2)). \qquad (9) \]

We also assume without loss of generality that $E_2 < E_1$. This implies that $\lambda_1^T < 0.5\,\lambda_{12}$.

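Now we begin to prove the statement. Firstly we simplify $\Delta S$ by noting that only two levels change their probabilities. We write

\[
\begin{aligned}
S &= -\sum_i \lambda_i \log \lambda_i \\
&= -\lambda_1 \log \lambda_1 - (\lambda_{12} - \lambda_1) \log(\lambda_{12} - \lambda_1) - \sum_{i=3}^{i_{\max}} \lambda_i \log \lambda_i \\
&= S_{12} - \sum_{j=3}^{i_{\max}} \lambda_j \log \lambda_j.
\end{aligned}
\]

We see that in any two-level interaction

\[ \Delta S = \Delta S_{12}. \qquad (10) \]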



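It is helpful to reexpress $S_{12}$ in terms of an actual entropy $\bar{S}_{12}$, so that we can use known properties of entropies to make statements about $S_{12}$. We let $\bar{\lambda}_1 := \lambda_1/\lambda_{12}$ and $\bar{\lambda}_2 := \lambda_2/\lambda_{12}$ such that $\bar{\lambda}_1 + \bar{\lambda}_2 = 1$. We define

\[ \bar{S}_{12} := -\bar{\lambda}_1 \log \bar{\lambda}_1 - \bar{\lambda}_2 \log \bar{\lambda}_2. \]

One can then see in a few lines of algebra that

\[ S_{12} = \lambda_{12} \bar{S}_{12} - \lambda_{12} \log \lambda_{12}. \]

Since $\lambda_{12}$ is constant during the interaction, it follows that

\[ \Delta S_{12} = \lambda_{12} \Delta \bar{S}_{12}. \qquad (11) \]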
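The rescaling identity behind Eq. (11) is easy to confirm numerically. The sketch below is only an illustration with arbitrarily chosen occupation probabilities; the function names are hypothetical and natural logarithms are assumed.

```python
import math


def pair_entropy(lam1, lam2):
    """Contribution S_12 of levels 1 and 2 to the total entropy (in nats)."""
    return -lam1 * math.log(lam1) - lam2 * math.log(lam2)


def normalised_pair_entropy(lam1, lam2):
    """Entropy of the two-level distribution after dividing by lam1 + lam2."""
    lam12 = lam1 + lam2
    bar1, bar2 = lam1 / lam12, lam2 / lam12
    return -bar1 * math.log(bar1) - bar2 * math.log(bar2)


def rescaled_form(lam1, lam2):
    """Right-hand side of the identity: lam12 * S_bar_12 - lam12 * log(lam12)."""
    lam12 = lam1 + lam2
    return lam12 * normalised_pair_entropy(lam1, lam2) - lam12 * math.log(lam12)


# Two states of the pair with the same total weight lam12 = 0.4.
before = (0.1, 0.3)
after = (0.15, 0.25)
lam12 = sum(before)

delta_s12 = pair_entropy(*after) - pair_entropy(*before)
delta_s12_bar = normalised_pair_entropy(*after) - normalised_pair_entropy(*before)
print(math.isclose(pair_entropy(*before), rescaled_form(*before)))
print(math.isclose(delta_s12, lam12 * delta_s12_bar))
```

The second comparison works because the term $\lambda_{12} \log \lambda_{12}$ is the same before and after the interaction and therefore cancels in the difference.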

We accordingly now want to show that $\lambda_{12} \Delta \bar{S}_{12} \ge \beta \Delta\langle E\rangle$.

We can now use the well-known property of the Shannon/von Neumann entropy: $\bar{S}_{12}$ is concave in $\bar{\lambda}_1 = \lambda_1/\lambda_{12}$. The function is accordingly upper bounded by any tangential line, as in Figure 2. Consider the tangential line at $\bar{\lambda}_1 = \bar{\lambda}_1^T$. At that point it follows from a few lines that

\[ \left. \frac{d \bar{S}_{12}}{d \bar{\lambda}_1} \right|_{\bar{\lambda}_1 = \bar{\lambda}_1^T} = \beta(E_1 - E_2). \qquad (12) \]

Figure 2: The entropy $\bar{S}_{12}$ is a function of $\bar{\lambda}_1$. The red dot corresponds to the thermal state in question, i.e. $\bar{\lambda}_1 = \bar{\lambda}_1^T$. The tangential upper bound has gradient $\beta(E_1 - E_2)$.
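The gradient in Eq. (12) and the tangential upper bound can be checked with a short numerical sketch. It assumes natural logarithms and uses arbitrary illustrative values for $\beta$, $E_1$ and $E_2$; the function names are hypothetical.

```python
import math


def binary_entropy(x):
    """Two-level entropy in nats as a function of the normalised occupation x."""
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def thermal_fraction(beta, e1, e2):
    """Normalised thermal occupation of level 1 for a two-level Gibbs state."""
    w1, w2 = math.exp(-beta * e1), math.exp(-beta * e2)
    return w1 / (w1 + w2)


def central_slope(func, x, step=1e-6):
    """Central finite-difference estimate of the derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


beta, e1, e2 = 1.3, 1.0, 0.2  # illustrative values with E_2 < E_1
x_thermal = thermal_fraction(beta, e1, e2)
numeric_slope = central_slope(binary_entropy, x_thermal)
analytic_slope = beta * (e1 - e2)


def tangent(x):
    """Tangential line to the entropy curve at the thermal point."""
    return binary_entropy(x_thermal) + analytic_slope * (x - x_thermal)


grid = [k / 100.0 for k in range(1, 100)]
bound_holds = all(binary_entropy(x) <= tangent(x) + 1e-12 for x in grid)
print(numeric_slope, analytic_slope, bound_holds)
```

The finite-difference slope agrees with $\beta(E_1 - E_2)$ up to discretisation error, and concavity guarantees that the tangent lies above the curve everywhere on the grid.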



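Note now that $\langle E\rangle$ may, similarly to the entropy, be written as

\[ \langle E\rangle = \sum_i \lambda_i E_i = \langle E\rangle_{12} + \langle E\rangle_{\mathrm{rest}}, \]

such that $\Delta\langle E\rangle = \Delta\langle E\rangle_{12} = (\Delta\lambda_1)(E_1 - E_2)$, with $\Delta\lambda_1 = \lambda_1' - \lambda_1$ the change in $\lambda_1$. So $\langle E\rangle(\lambda_1)$ is a line with gradient given by

\[ \frac{\Delta\langle E\rangle}{\Delta\lambda_1} = E_1 - E_2. \]

Similarly, since $\lambda_1 = \lambda_{12} \bar{\lambda}_1$,

\[ \frac{\Delta\langle E\rangle}{\Delta\bar{\lambda}_1} = \lambda_{12} (E_1 - E_2). \]

Comparing this with the gradient of the tangential line to $\bar{S}_{12}$ in Eq. 12 we see that $\frac{\beta}{\lambda_{12}} \langle E\rangle_{12}$ has the same gradient as the tangential line. We therefore only need to show that the change in the tangential line is upper bounded by the change in the entropy curve, as it is equivalent to showing that $\Delta\bar{S}_{12} \ge \frac{\beta}{\lambda_{12}} \Delta\langle E\rangle_{12}$. This must hold for all possible initial and final values of $\bar{\lambda}_1$ and all possible values of $\bar{\lambda}_1^T$ (recall that we assumed without loss of generality that $\bar{\lambda}_1^T < 0.5$). These can be grouped into three cases.

1. $\bar{\lambda}_1 < \bar{\lambda}_1^T$. Here the tangential bound implies that $\Delta\bar{S}_{12} \ge \frac{\beta}{\lambda_{12}} \Delta\langle E\rangle_{12} \ge 0$.

2. $\bar{\lambda}_1^T < \bar{\lambda}_1 < 0.5$. Here the tangential bound implies that $0 \ge \Delta\bar{S}_{12} \ge \frac{\beta}{\lambda_{12}} \Delta\langle E\rangle_{12}$.

3. $\bar{\lambda}_1 > 0.5$, also after the interaction. Here the tangential bound implies that $\Delta\bar{S}_{12} \ge 0 \ge \frac{\beta}{\lambda_{12}} \Delta\langle E\rangle_{12}$.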


This implies the lemma. 



□ 
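Lemma 8 can also be probed numerically. The following is a minimal sketch under an explicit modelling assumption that is not taken from the text: a partial two-level thermalisation is represented as moving the occupation of one level a given fraction of the way towards its thermal value while the total weight of the pair stays fixed. All names, energies and probabilities are illustrative.

```python
import math


def entropy(probs):
    """Entropy in nats of a state that is diagonal with the given eigenvalues."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_energy(probs, energies):
    """Expected internal energy sum_i lambda_i E_i."""
    return sum(p * e for p, e in zip(probs, energies))


def partial_thermalise(probs, energies, beta, i, j, fraction):
    """Assumed model: move levels i and j a fraction of the way to their thermal ratio."""
    pair_weight = probs[i] + probs[j]  # conserved by the two-level interaction
    gibbs_i = math.exp(-beta * energies[i])
    gibbs_j = math.exp(-beta * energies[j])
    thermal_i = pair_weight * gibbs_i / (gibbs_i + gibbs_j)
    updated = list(probs)
    updated[i] = probs[i] + fraction * (thermal_i - probs[i])
    updated[j] = pair_weight - updated[i]
    return updated


def entropy_law_gap(probs, energies, beta, i, j, fraction):
    """Delta S - beta * Delta <E>, which the lemma states is non-negative."""
    updated = partial_thermalise(probs, energies, beta, i, j, fraction)
    delta_s = entropy(updated) - entropy(probs)
    delta_e = mean_energy(updated, energies) - mean_energy(probs, energies)
    return delta_s - beta * delta_e


energies = [1.0, 0.2, 0.7]
beta = 1.5
gaps = [
    entropy_law_gap([lam1, 0.6 - lam1, 0.4], energies, beta, 0, 1, fraction)
    for lam1 in (0.05, 0.2, 0.35, 0.55)
    for fraction in (0.25, 0.5, 1.0)
]
print(min(gaps) >= -1e-12)
```

The starting occupations are chosen to sample below the thermal value, between the thermal value and half of the pair weight, and above half of the pair weight, mirroring the three cases of the proof.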



